Halopiger goleamassiliensis strain IIH3T sp. nov. is a novel, extremely halophilic archaeon
within the genus Halopiger. The strain was isolated from an evaporitic sediment in El Golea
Lake, Ghardaia region (Algeria). The type strain is IIH3T. H. goleamassiliensis is moderately
thermophilic, neutrophilic, non-motile and coccus-shaped. Here we describe the features of
this organism, together with the complete genome sequence and annotation. The 3,906,923 bp
long genome contains 3,854 protein-encoding genes and 49 RNA genes (one 16S rRNA gene, one
23S rRNA gene, three 5S rRNA genes and 44 tRNA genes).



Introduction 

Halopiger goleamassiliensis sp. nov. strain IIH3T (= KC430940 = CSUR P3036 = DSM on-going
deposit) is the type strain of H. goleamassiliensis sp. nov. This organism is a Gram-negative,
extremely halophilic, moderately thermophilic and strictly aerobic archaeon. It was isolated
from an evaporitic sediment in El Golea Lake, Ghardaia region (Algeria), as part of a project
studying archaeal diversity in hypersaline lakes of Algeria.

The number of genera and species belonging to the Halobacteria (Archaea, Euryarchaeota) has
increased recently due to studies of several different hypersaline environments
(thalassohaline and athalassohaline) combined with the use of different isolation media and
culture conditions [1]. At the time of writing, the family Halobacteriaceae, the single family
described within the order Halobacteriales, accommodated 40 recognized genera [2]. The genus
Halopiger was proposed by Gutierrez et al. (2007) [3] and contains only three species:
Halopiger xanaduensis, isolated from Shangmatala Lake (China) [3]; Halopiger aswanensis,
isolated from a hypersaline soil in Aswan (Egypt) [4]; and Halopiger salifodinae, recently
isolated from a salt mine in Kuche county, Xinjiang province, China [5]. So far, this genus is
composed of strictly aerobic, Gram-negative, polymorphic and pigmented strains. We have
recently used [6-18] a polyphasic approach for prokaryotic classification [19] that includes
genomic data [20,21], MALDI-TOF spectra [22,23] and major phenotypic characteristics.

Using this approach, we report here a summary classification and a set of features for
Halopiger goleamassiliensis sp. nov. strain IIH3T, together with the description of the
complete genomic sequencing and annotation. These characteristics support the circumscription
of the species H. goleamassiliensis.

Classification and features 

H. goleamassiliensis was isolated from an evaporitic sediment of the hypersaline Lake El Golea
in the Ghardaia region of Algeria. The sediment sample (1 g) was enriched in a liquid SG
medium [24] containing ampicillin (100 µg/mL) at 55°C on a rotary shaking platform (150 rpm)
for 7 to 15 days. Serial dilutions of enrichment cultures were plated on SG agar plates and
incubated aerobically at 55°C. After 2 to 6 weeks of incubation, representative colonies were
picked and maintained in the SG medium at 55°C. Strain IIH3T (Table 1) was isolated in 2012 by
cultivation in aerobic conditions at 55°C and stored at -80°C with 25% (v/v) glycerol.

Genomic DNA was extracted and purified using the Genomic DNA purification kit
(MACHEREY-NAGEL, Hoerdt, France). The 16S rRNA gene was amplified by PCR using the primers
21AF (TTCCGGTTGATCCTGCCGGA) and RP2 (ACGGCTACCTTGTTACGACTT). A total of 1,444 bases were
identified. The sequence was compared with available sequences in GenBank using a BLAST
search [36]. The strain exhibited 96% nucleotide sequence similarity with Halopiger
xanaduensis [3]. This value was lower than the 98.7% 16S rRNA gene sequence threshold
recommended by Stackebrandt and Ebers to delineate a new species without carrying out DNA-DNA
hybridization [37]. A phylogenetic tree (Figure 1) was constructed using the neighbor-joining
method with the MEGA 5 program package [38] after multiple alignment of the data using MUSCLE
[39]. Evolutionary distances were calculated using the Tamura-Nei model [40].
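The comparison against the 98.7% threshold can be illustrated with a minimal sketch. It assumes two sequences that have already been aligned (gaps written as "-") and counts identity only over columns where neither sequence has a gap; this is one common convention and not necessarily the one BLAST reports. The function names and the toy sequences are hypothetical and are not data from this study.

```python
SPECIES_THRESHOLD_PCT = 98.7  # 16S rRNA similarity threshold cited in the text [37]


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over alignment columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == "-" or y == "-":
            continue  # skip gapped columns (an assumption of this sketch)
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


def below_species_threshold(identity_pct: float,
                            threshold: float = SPECIES_THRESHOLD_PCT) -> bool:
    """True when the similarity is lower than the species-delineation threshold."""
    return identity_pct < threshold


# Toy, made-up alignment (20 columns, one gap, one mismatch).
toy_a = "TTCCGGTTGATCCTGCCGGA"
toy_b = "TTCCGGTTGA-CCTGCTGGA"
toy_identity = percent_identity(toy_a, toy_b)
print(f"toy identity: {toy_identity:.2f}%")

# The 96% similarity reported in the text falls below the threshold.
print(below_species_threshold(96.0))
```

Whether gapped columns count as mismatches changes the percentage slightly, so the convention should be stated whenever a value is compared with the threshold.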



Table 1. Classification and general features of Halopiger goleamassiliensis according to the MIGS recommendations [25].

MIGS ID     Property                 Term                                  Evidence code a

            Current classification   Domain Archaea                        TAS [26]
                                     Phylum Euryarchaeota                  TAS [27]
                                     Class Halobacteria                    TAS [28,29]
                                     Order Halobacteriales                 TAS [30-32]
                                     Family Halobacteriaceae               TAS [33,34]
                                     Genus Halopiger                       TAS [3]
                                     Species Halopiger goleamassiliensis   IDA
                                     Type strain IIH3T                     IDA
            Gram stain               Negative                              IDA
            Cell shape               Coccus                                IDA
            Motility                 Non-motile                            IDA
            Sporulation              None                                  IDA
            Temperature range        Thermophile, between 40°C and 60°C    IDA
            Optimum temperature      55°C                                  IDA
MIGS-6.3    Salinity                 Halophile, 22.5%-25% (optimum)        IDA
MIGS-22     Oxygen requirement       Aerobic                               IDA
            Carbon source            Sugar or amino acids                  IDA
            Energy metabolism        Heterotrophic                         IDA
MIGS-6      Habitat                  Salt lake sediment                    IDA
MIGS-15     Biotopic relationship    Free living                           IDA
MIGS-14     Pathogenicity            Non-pathogenic                        NAS
            Biosafety level          1                                     NAS
            Isolation                Sediment of El Golea Lake             IDA
MIGS-4      Geographic location      Algeria                               IDA
MIGS-5      Isolation time           2012                                  IDA
MIGS-4.1    Latitude                 30-34 N                               IDA
MIGS-4.2    Longitude                002-52 E                              IDA
MIGS-4.3    Depth                    Surface                               IDA
MIGS-4.4    Altitude                 397                                   IDA

a Evidence codes - IDA: Inferred from Direct Assay; TAS: Traceable Author Statement (i.e., a
direct report exists in the literature); NAS: Non-traceable Author Statement (i.e., not
directly observed for the living, isolated sample, but based on a generally accepted property
for the species, or anecdotal evidence). These evidence codes are from the Gene Ontology
project [35]. If the evidence is IDA, then the property was directly observed for a live
isolate by one of the authors or an expert mentioned in the acknowledgements.
[Figure 1: phylogenetic tree. Taxa shown, with GenBank accession numbers: Haloterrigena
thermotolerans (JX982773), Natrinema gari (NR_028167), Natrialba aegyptia (NR_028176),
Halostagnicola larsenii (NR_042452), Halovivax ruber (NR_042522), Natronococcus occultus
(NR_028255), Natronolimnobius baerhuensis (NR_028161), Halopiger salifodinae (JX014296),
Halopiger goleamassiliensis (KC430940), Halopiger xanaduensis (NR_042521), Halopiger
aswanensis (AF333759), Halorhabdus utahensis (NR_028172), Halomicrobium mukohataei
(NR_044337), Haloplanus natans (NR_043803), Halogeometricum borinquense (NR_028170),
Haloferax mediterranei (NR_028212) and Methanospirillum hungatei (NR_074177). Scale bar: 0.02.]

Figure 1. Neighbor-joining phylogenetic tree based on 16S rRNA gene sequence comparisons,
showing the position of strain IIH3T and other related haloarchaeal species. GenBank accession
numbers are indicated in parentheses. Sequences were aligned using MUSCLE, and phylogenetic
inferences were obtained using the MEGA software. Numbers at the nodes are from a bootstrap
analysis done using 1,000 replicates to generate a majority consensus tree. Methanospirillum
hungatei was used as the outgroup.



Phenotypic characterization was carried out according to the recommended minimal standards
for the description of new taxa in the order Halobacteriales [41]. Table 2 summarizes the
differential phenotypic characteristics of H. goleamassiliensis sp. nov. IIH3T, H. xanaduensis
SH-6T, H. aswanensis 56T and H. salifodinae KCY076B2T. Different growth temperatures (30, 37,
40, 50, 55 and 60°C), pH values (5, 6, 7, 7.5, 8, 8.5, 9, 10, 11 and 12) and NaCl
concentrations (0, 10, 12, 15, 20, 22.5, 25 and 30% w/v) were tested on strain IIH3T. Growth
was observed between 40°C and 60°C (optimum at 55°C), between 15% and 30% NaCl (optimum at
22.5-25% NaCl) and between pH 7 and pH 11 (optimum at pH 8).

Under optimal growth conditions on SG agar medium and after incubation for 15-20 days at 55°C,
colonies were salmon-pigmented and circular, with a diameter of 1-2 mm. Cell morphology and
motility were examined using light microscopy and phase-contrast microscopy. Gram staining was
performed using samples fixed with acetic acid, as described by Dussault in 1955 [42]. Cells
are Gram-negative cocci (Figure 2) measuring 0.8-1.5 µm in diameter (Figure 3). Motility,
spores and capsules were not observed. All the following biochemical and nutritional tests
were performed in duplicate. Strain IIH3T was found to be oxidase- and catalase-positive. The
strain is extremely halophilic and cell lysis is observed in distilled water. It is a strictly
aerobic organism and anaerobic growth does not occur even in the presence of KNO3 or arginine.
Neither magnesium nor amino acids are required for growth. Tween 80, gelatin, and lipids from
egg yolk are hydrolysed, whereas urea, starch and casein are not, and phosphatase activity was
not detected. Indole production and the methyl red, Voges-Proskauer and Simmons' citrate tests
are negative. H2S is not produced from cysteine.

Figure 2. Gram stain of Halopiger goleamassiliensis strain IIH3T.

Figure 3. Transmission electron micrograph of H. goleamassiliensis strain IIH3T, taken using a
Morgagni 268D (Philips) at an operating voltage of 60 kV. The scale bar represents 1 µm.



Utilization of carbohydrates and other compounds as sole carbon sources and acid production
from these compounds were determined as described by Oren [41]. Several sugars and amino acids
can serve as sole carbon and energy sources (Table 2). Antibiotic sensitivity was determined
on SG medium agar plates with antibiotic discs. Strain IIH3T is susceptible to bacitracin
(10 µg), novobiocin (30 µg), streptomycin (10 µg) and sulfamethoxazole (25 µg), but resistant
to ampicillin (10 µg), cephalothin (30 µg), chloramphenicol (30 µg), erythromycin (15 µg),
gentamicin (10 µg), kanamycin (30 µg), nalidixic acid (30 µg), penicillin G (10 µg),
rifampicin (30 µg), tetracycline (30 µg) and vancomycin (30 µg).

Table 2. Differential phenotypic characteristics between strain IIH3T and related species



Characteristic              H. goleamassiliensis   H. xanaduensis     H. aswanensis      H. salifodinae

Cell morphology             coccus                 pleomorphic        pleomorphic        pleomorphic [illegible]
Cell diameter (µm)          0.8-1.5                [illegible]        [illegible]        ND
Pigmentation                salmon                 red                pink               cream
Oxygen requirement          strictly aerobic       strictly aerobic   strictly aerobic   strictly aerobic
Gram stain                  negative               negative           negative           negative
NaCl range (%, w/v)         15-30                  15-30              10-30              11-31
NaCl optimum (%, w/v)       22.5-25                [illegible]        22.5-25            17-20
Temperature range (°C)      40-60                  28-45              40-50              [illegible]
Temperature optimum (°C)    55                     37                 [illegible]        [illegible]
pH range                    7-11                   6-11               [illegible]        [illegible]
pH optimum                  8-8.5                  7.5-8              [illegible]        7.0
Motility                    non-motile             non-motile         motile             non-motile
Catalase                    +                      +                  +                  +

Hydrolysis of
  Starch                    -                      -                  +                  -
  Tween 80                  +                      +                  +                  -
  Casein                    -                      -                  -                  ND
  Gelatin                   +                      +                  -                  -
  Lipids from egg yolk      +                      ND                 -                  ND

Utilization of
  D-Glucose                 +                      +                  +                  +
  Galactose                 +                      +                  ND                 -
  D-Xylose                  +                      +                  +                  -
  Lactose                   +                      -                  -                  -
  Fructose                  -                      -                  +                  -
  Starch                    -                      -                  +                  +
  Mannose                   +                      -                  ND                 +
  D-Ribose                  +                      -                  ND                 -
  Sucrose                   [illegible]            ND                 [illegible]        ND
  Rhamnose                  +                      ND                 ND                 -
  Mannitol                  -                      -                  ND                 ND
  Citrate                   -                      -                  ND                 -
  L-Arginine                -                      -                  -                  -

Indole production           -                      -                  +                  -
Urease                      -                      +                  -                  -
H2S production              -                      -                  +                  +

Strains: H. goleamassiliensis sp. nov. IIH3T; H. xanaduensis SH-6T; H. aswanensis 56T; H. salifodinae KCY076B2T.
+: Positive result, -: Negative result, ND: Not Determined



Matrix-assisted laser desorption/ionization time-of-flight mass spectrometry (MALDI-TOF MS)
is considered a reliable and rapid identification method for extremophilic prokaryotes [22,23]
and was used in the present study to characterize strain IIH3T as previously described [6-18].
A pipette tip was used to pick one isolated archaeal colony from a culture agar plate and to
spread it as a thin film on an MTP 384 MALDI-TOF target plate (Bruker Daltonics, Leipzig,
Germany). The colonies from strain IIH3T and from other species of archaea were spotted in
triplicate. After air-drying, 1.5 µl of matrix solution (a saturated solution of
α-cyano-4-hydroxycinnamic acid [CHCA] in 50% aqueous acetonitrile containing 2.5%
trifluoroacetic acid) was applied per spot and allowed to dry for five minutes.

Mass spectrometric measurements were performed with a Microflex spectrometer (Bruker).
Spectra were recorded in the positive linear mode for the mass range of 2,000 to 20,000 Da.
The acceleration voltage was 20 kV. The time of acquisition was between 30 seconds and 1
minute per spot. Spectra were collected as a sum of 240 shots across a spot. Preprocessing and
identification steps were performed using the manufacturer's parameters. The IIH3T spectrum
(Figure 4) was imported into the MALDI BioTyper software (version 2.0, Bruker) and analyzed by
standard pattern matching (with default parameter settings) against the spectra of Haloferax
mediterranei, Natrinema gari, Natrinema pallidum, Haloterrigena thermotolerans, Haloterrigena
sp., Halogeometricum sp., Haloarcula sp. and Halopiger sp. used as reference data in the
BioTyper database (Figure 5).

The matching score determined whether the isolate could be identified: a score > 2.3 with a
validly published species enabled identification at the species level; a score > 1.7 but < 2
enabled identification at the genus level; and a score < 1.7 did not enable any
identification. For strain IIH3T, none of the obtained scores was > 1, suggesting that our
isolate was not a member of a known species. We added the spectrum from strain IIH3T to our
database for future reference. Figure 5 shows the MALDI-TOF MS spectrum differences between
H. goleamassiliensis and other Archaea.
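The score rule above can be written as a small decision function. This is a sketch that follows the cut-offs exactly as worded in the text; scores of exactly 1.7 or between 2 and 2.3 are not covered by that wording, so the function reports them as unspecified rather than guessing. The function name and the example scores are hypothetical and are not measurements from this study.

```python
def interpret_score(score: float) -> str:
    """Map a pattern-matching score to the identification level stated in the text."""
    if score > 2.3:
        return "species-level identification"
    if 1.7 < score < 2.0:
        return "genus-level identification"
    if score < 1.7:
        return "no identification"
    # Scores of exactly 1.7, or from 2.0 to 2.3, are not covered by the stated rule.
    return "not specified by the stated cut-offs"


# Made-up example scores against reference spectra (illustration only).
example_scores = {"reference A": 0.62, "reference B": 0.91, "reference C": 0.35}

best_reference = max(example_scores, key=example_scores.get)
best_score = example_scores[best_reference]
print(best_reference, best_score, interpret_score(best_score))
```

Only the best score matters for the decision: if even the best match falls under 1.7, as reported for strain IIH3T where no score exceeded 1, the isolate is not assigned to any species in the database.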



Figure 4. Reference mass spectrum from H. goleamassiliensis strain IIH3T. Spectra from 12
individual colonies were compared and a reference spectrum was generated.



Genome sequencing information 

Genome project history 

The organism was selected for sequencing on the basis of its phylogenetic position and 16S
rRNA similarity to other members of the genus Halopiger, and as part of a study of archaeal
diversity in hypersaline lakes of Algeria. It is the second genome of a Halopiger species and
the first sequenced genome of H. goleamassiliensis sp. nov. The EMBL accession numbers are
CBMB010000001-CBMB010000011 and the assembly consists of 3 scaffolds (HG315690-HG315692). A
summary of the project information (PRJEB1780) and its association with MIGS version 2.0
recommendations [27] is shown in Table 3.



[Figure 5: gel view. Spectra shown: Haloarcula sp., Halopiger sp., Haloterrigena sp.,
Halogeometricum sp., H. mediterranei, H. goleamassiliensis, Natrinema gari, Natrinema pallidum
and H. thermotolerans.]

Figure 5. Gel view comparing the H. goleamassiliensis strain IIH3T spectrum with those of
other archaea. The gel view displays the raw spectra of all loaded spectrum files arranged in
a pseudo-gel-like look. The x-axis records the m/z value. The left y-axis displays the running
spectrum number originating from subsequent spectra loading. The peak intensity is expressed
by the gray scale intensity. The scale shown on the right y-axis links the color to the peak
intensity in arbitrary units.



Growth conditions and DNA isolation 

H. goleamassiliensis sp. nov. strain IIH3T (= CSUR P3036 = DSM on-going deposit) was grown in
SG medium at 55°C under aerobic conditions. DNA was isolated and purified using the Genomic
DNA purification kit, NucleoSpin Tissue procedure (MACHEREY-NAGEL), following the standard
protocol recommended by the manufacturer. The quality of the DNA was checked on an agarose gel
(0.8%) stained with SYBR Safe. The DNA concentration, measured with the Quant-iT PicoGreen kit
(Invitrogen) on a Genios Tecan fluorometer, was 33.1 ng/µL.

Genome sequencing and assembly 

A 5 kb paired-end sequencing strategy (Roche, Meylan, France) was used. This project was
loaded on a 1/4 region of a PicoTiterPlate (PTP; Roche). Three µg of DNA was mechanically
fragmented on the Covaris device (KBioScience-LGC Genomics, Teddington, UK) using
miniTUBE-Red 5Kb. The DNA fragmentation was visualized through an Agilent 2100 BioAnalyzer on
a DNA labchip 7500 with an optimal size of 4.7 kb. The library was constructed according to
the 454 GS FLX Titanium paired-end protocol. After PCR amplification through 17 cycles
followed by double size selection, the single-stranded paired-end library was loaded on an RNA
pico 6000 labchip on the BioAnalyzer. The pattern showed an optimum at 480 bp and the
concentration was quantified on a Genios Tecan fluorometer at 642 pg/µL. The concentration
equivalence of the library was calculated at 10^8 molecules/µL. The library was stored at
-20°C until further use, and amplified in 2 emPCR reactions at 0.25 copies per bead (cpb), in
2 emPCR reactions at 0.5 cpb and in 2 emPCR reactions at 1 cpb with the GS Titanium SV emPCR
Kit (Lib-L) v2 (Roche). The yields of the 3 types of paired-end emPCR reactions were 3.68%,
8.05% and 10.69%, respectively, in the quality range of 5 to 20% expected from the Roche
procedure. These emPCR reactions were pooled. Both libraries were loaded onto GS Titanium
PicoTiterPlates (PTP Kit 70x75, Roche) and pyrosequenced with the GS Titanium Sequencing Kit
XLR70 (Roche). The run was performed overnight and then analyzed on the cluster through the
gsRunBrowser and Newbler assembler (Roche).

Table 3. Project information



MIGS ID      Property                Term

MIGS-31      Finishing quality       High-quality draft
MIGS-28      Libraries used          Paired-end 5 kb library
MIGS-29      Sequencing platforms    454 GS FLX Titanium
MIGS-31.2    Fold coverage           21.6×
MIGS-30      Assemblers              Newbler version 2.5.3
MIGS-32      Gene calling method     Prodigal
             EMBL ID                 CBMB010000001-CBMB010000011
             EMBL Date of Release    June 18, 2018
             Project relevance       Study of the archaeal diversity in hypersaline lakes of Algeria



A total of 271,702 filter-passed wells were obtained and generated 84.39 Mb with an average
length of 325 bp. The filter-passed sequences were assembled using Newbler with 90% identity
and 40 bp overlap. The final assembly contained 12 contigs (11 large contigs >1,500 bp)
arranged in 3 scaffolds and generated a genome size of 3.9 Mb, which corresponds to a coverage
of 21.6× genome equivalent.
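The reported coverage follows from dividing the sequenced bases by the genome size. The sketch below reproduces that arithmetic from the figures given in the text (84.39 Mb of filter-passed sequence and a 3,906,923 bp genome); the function and variable names are illustrative only.

```python
def fold_coverage(total_sequenced_bases: float, genome_size_bp: int) -> float:
    """Average depth of coverage: sequenced bases divided by genome length."""
    if genome_size_bp <= 0:
        raise ValueError("genome size must be positive")
    return total_sequenced_bases / genome_size_bp


TOTAL_SEQUENCED_BASES = 84.39e6   # 84.39 Mb of filter-passed sequence (from the text)
GENOME_SIZE_BP = 3_906_923        # assembled genome size (from the text)

coverage = fold_coverage(TOTAL_SEQUENCED_BASES, GENOME_SIZE_BP)
print(f"fold coverage: {coverage:.1f}x")
```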

Genome annotation 

Open Reading Frames (ORFs) were predicted using Prodigal with default parameters [43]. ORFs
spanning a sequencing gap region were excluded. Assessment of protein function was obtained by
comparing the predicted protein sequences with sequences in the GenBank [44] and the Clusters
of Orthologous Groups (COG) databases using BLASTP. RNAmmer [45] and tRNAscan-SE 1.21 [46]
were used for identifying the rRNAs and tRNAs, respectively. SignalP [47] and TMHMM [48] were
used to predict signal peptides and transmembrane helices, respectively. For alignment lengths
greater than 80 amino acids, ORFans were identified if their BLASTP E-value was lower than
1e-03. An E-value of 1e-05 was used if alignment lengths were smaller than 80 amino acids. DNA
Plotter [49] was used for visualization of genomic features and Artemis [50] was used for data
management. The mean level of nucleotide sequence similarity was estimated at the genome level
between H. goleamassiliensis and 5 other members of the Halobacteriaceae family (Table 6), by
BLASTN comparison of orthologous ORFs in pairwise genomes. Orthologous proteins were detected
using the Proteinortho software with the following parameters: E-value 1e-05, 30% identity,
50% coverage and 50% of algebraic connectivity [51].
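The length-dependent E-value rule can be sketched as follows. The text gives the two cut-offs (1e-03 above 80 amino acids, 1e-05 below) but does not say how an alignment of exactly 80 residues is handled, nor how the comparison is turned into the final ORFan call, so the sketch only selects the cut-off and tests an E-value against it. Treating a length of exactly 80 as "short" is an assumption, and all names are hypothetical.

```python
LENGTH_BOUNDARY_AA = 80        # alignment-length boundary given in the text
CUTOFF_LONG = 1e-3             # E-value cut-off for alignments longer than 80 aa
CUTOFF_SHORT = 1e-5            # E-value cut-off for shorter alignments


def evalue_cutoff(alignment_length_aa: int) -> float:
    """Return the E-value cut-off that applies to an alignment of this length.

    Assumption: a length of exactly 80 aa uses the stricter (short) cut-off.
    """
    if alignment_length_aa > LENGTH_BOUNDARY_AA:
        return CUTOFF_LONG
    return CUTOFF_SHORT


def below_cutoff(evalue: float, alignment_length_aa: int) -> bool:
    """True when the BLASTP E-value is lower than the length-appropriate cut-off."""
    return evalue < evalue_cutoff(alignment_length_aa)


# The same E-value is judged differently depending on alignment length.
print(below_cutoff(1e-4, 120))
print(below_cutoff(1e-4, 60))
```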

Genome properties 

The genome is 3,906,923 bp long and displays a G+C content of 66.06% (Table 4, Figure 6). It
is composed of 12 contigs (11 large contigs >1,500 bp) arranged into 3 scaffolds. Of the 3,903
predicted genes, 3,854 were protein-coding genes and 49 were RNA genes (one 16S rRNA gene, one
23S rRNA gene, three 5S rRNA genes and 44 tRNA genes). A total of 2,359 genes (61.21%) were
assigned a putative function (by COG or by NR BLAST) and 188 genes were identified as ORFans
(4.88%). The remaining genes were annotated as hypothetical proteins (1,059 genes = 27.48%).
The distribution of genes into COG functional categories is presented in Table 5. The
properties and the statistics of the genome are summarized in Tables 4 and 5.
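The percentages quoted above and in Table 4 can be recomputed from the raw counts. In this sketch the denominators are inferred by matching the reported values: genome size for the base-pair rows, total genes for the RNA and protein-coding rows, and protein-coding genes for the remaining rows. Names are illustrative only.

```python
GENOME_SIZE_BP = 3_906_923
TOTAL_GENES = 3_903
PROTEIN_CODING_GENES = 3_854


def pct(part: int, whole: int) -> float:
    """Percentage of `part` in `whole`, rounded to two decimals as in the tables."""
    return round(100.0 * part / whole, 2)


# (count, denominator) pairs; counts are taken from the text and Table 4.
rows = {
    "DNA G+C content (bp)": (2_581_064, GENOME_SIZE_BP),
    "DNA coding region (bp)": (3_359_291, GENOME_SIZE_BP),
    "RNA genes": (49, TOTAL_GENES),
    "Protein-coding genes": (3_854, TOTAL_GENES),
    "Genes with function prediction": (2_359, PROTEIN_CODING_GENES),
    "Genes assigned to COGs": (2_446, PROTEIN_CODING_GENES),
    "Genes with peptide signals": (320, PROTEIN_CODING_GENES),
    "Genes with transmembrane helices": (906, PROTEIN_CODING_GENES),
    "ORFans": (188, PROTEIN_CODING_GENES),
    "Hypothetical proteins": (1_059, PROTEIN_CODING_GENES),
}

percentages = {name: pct(count, total) for name, (count, total) in rows.items()}
for name, value in percentages.items():
    print(f"{name:34s} {value:6.2f}")

# The RNA gene count is the sum of its components: 16S + 23S + 5S + tRNA.
rna_genes = 1 + 1 + 3 + 44
```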

Comparison with other genomes 

Currently, only one genome from a Halopiger species is available. Here, we compared the
genome of H. goleamassiliensis strain IIH3T with those of H. xanaduensis strain SH-6,
Halalkalicoccus jeotgali strain B3, Natronomonas pharaonis strain DSM 2160, Haloterrigena
turkmenica strain DSM 5511 and Natrialba magadii strain ATCC 43099. The genome of H.
goleamassiliensis (3.90 Mb) is larger than those of Halalkalicoccus jeotgali and Natronomonas
pharaonis (3.69 and 2.75 Mb, respectively) but smaller than those of H. xanaduensis, Natrialba
magadii and Haloterrigena turkmenica (4.35, 4.44 and 5.44 Mb, respectively). The G+C content
of H. goleamassiliensis (66.06%) is higher than those of H. xanaduensis (65.2%), Haloterrigena
turkmenica (64.26%), Natronomonas pharaonis (63.1%), Halalkalicoccus jeotgali (62.5%) and
Natrialba magadii (61.1%). H. goleamassiliensis has more predicted protein-coding genes
(3,854) than Haloterrigena turkmenica, H. xanaduensis, Natrialba magadii, Halalkalicoccus
jeotgali and Natronomonas pharaonis (3,739, 3,588, 3,559, 3,035 and 2,659, respectively). In
addition, H. goleamassiliensis shared a mean genomic sequence similarity of 67.60, 78.21,
76.27, 68.70 and 78.62% with Natronomonas pharaonis, Haloterrigena turkmenica, Natrialba
magadii, Halalkalicoccus jeotgali and Halopiger xanaduensis, respectively (Table 6).




Figure 6. Graphical circular map of the H. goleamassiliensis IIH3T genome. From the outside
in: the first circle indicates the scaffolds; the next two circles show open reading frames
oriented in the forward and reverse directions, respectively (colored by COG categories). The
fourth circle displays the rRNA gene operon (red) and tRNA genes (green). The fifth circle
shows the G+C% content plot. The innermost circle shows the GC skew, with purple and olive
indicating negative and positive values, respectively.

Table 4. Nucleotide content and gene count levels of the genome



Attribute                           Value        % of total a

Genome size (bp)                    3,906,923    100.00
DNA G+C content (bp)                2,581,064    66.06
DNA coding region (bp)              3,359,291    85.98
Total genes                         3,903        100.00
RNA genes                           49           1.26
Protein-coding genes                3,854        98.74
Genes with function prediction      2,359        61.21
Genes assigned to COGs              2,446        63.47
Genes with peptide signals          320          8.30
Genes with transmembrane helices    906          23.51

a The total is based on either the size of the genome in base pairs or the total number of
protein coding genes in the annotated genome.



Table 5. Number of genes associated with the 25 general COG functional categories

Code   Value   % of total a   Description

J      166     4.31           Translation
A      1       0.03           RNA processing and modification
K      157     4.07           Transcription
L      113     2.93           Replication, recombination and repair
B      3       0.08           Chromatin structure and dynamics
D      18      0.47           Cell cycle control, mitosis and meiosis
Y      0       0              Nuclear structure
V      46      1.19           Defense mechanisms
T      128     3.32           Signal transduction mechanisms
M      74      1.92           Cell wall/membrane biogenesis
N      51      1.32           Cell motility
Z      0       0              Cytoskeleton
W      0       0              Extracellular structures
U      27      0.70           Intracellular trafficking and secretion
O      114     2.96           Post-translational modification, protein turnover, chaperones
C      168     4.36           Energy production and conversion
G      122     3.17           Carbohydrate transport and metabolism
E      266     6.90           Amino acid transport and metabolism
F      70      1.82           Nucleotide transport and metabolism
H      131     3.40           Coenzyme transport and metabolism
I      107     2.78           Lipid transport and metabolism
P      176     4.57           Inorganic ion transport and metabolism
Q      82      2.13           Secondary metabolites biosynthesis, transport and catabolism
R      510     13.23          General function prediction only
S      248     6.43           Function unknown
-      1408    36.53          Not in COGs

a The total is based on the total number of protein coding genes in the annotated genome.

Table 6. Orthologous gene comparison and average nucleotide identity of H. goleamassiliensis
with other compared genomes (upper right, numbers of orthologous genes; lower left, mean
nucleotide identities of orthologous genes). Numbers on the diagonal indicate the number of
genes of each genome.

Species (accession number)                H. goleamassiliensis   N. pharaonis   H. turkmenica   N. magadii   H. jeotgali   H. xanaduensis

Halopiger goleamassiliensis (PRJEB1780)   3854                   1415           2036            1859         1542          2103
Natronomonas pharaonis (NC_007426)        67.60                  2659           1393            1321         1254          1381
Haloterrigena turkmenica (NC_013743)      78.21                  67.81          3739            1765         1559          2057
Natrialba magadii (NC_013922)             76.27                  66.85          76.83           3559         1442          1828
Halalkalicoccus jeotgali (NC_014297)      68.70                  67.76          68.97           67.55        3035          1589
Halopiger xanaduensis (NC_015666)         78.62                  67.52          79.73           76.98        68.83         3588
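The H. goleamassiliensis row and column of Table 6 can be summarized by ranking the compared genomes. The sketch below uses only values from the table; expressing the shared orthologs as a fraction of the 3,854 H. goleamassiliensis protein-coding genes is an illustrative choice and not a statistic reported in the text.

```python
H_GOLEAMASSILIENSIS_GENES = 3_854

# (mean nucleotide identity of orthologs in %, number of orthologous genes), from Table 6.
comparisons = {
    "Natronomonas pharaonis": (67.60, 1415),
    "Haloterrigena turkmenica": (78.21, 2036),
    "Natrialba magadii": (76.27, 1859),
    "Halalkalicoccus jeotgali": (68.70, 1542),
    "Halopiger xanaduensis": (78.62, 2103),
}

# Rank genomes from most to least similar by mean nucleotide identity.
ranked = sorted(comparisons.items(), key=lambda item: item[1][0], reverse=True)

for species, (identity, orthologs) in ranked:
    shared_pct = 100.0 * orthologs / H_GOLEAMASSILIENSIS_GENES
    print(f"{species:26s} identity {identity:5.2f}%  orthologs {orthologs:4d} "
          f"({shared_pct:.1f}% of H. goleamassiliensis genes)")

closest_species = ranked[0][0]
```

Both measures point the same way here: the genome with the highest mean identity, H. xanaduensis, also shares the largest number of orthologous genes with strain IIH3T.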



Conclusion 

On the basis of phenotypic, phylogenetic and genomic analyses, we formally propose the
creation of Halopiger goleamassiliensis sp. nov., which contains the strain IIH3T. This
archaeal strain was found in Algeria.

Description of Halopiger goleamassiliensis 
sp. nov. 

Halopiger goleamassiliensis (go.le'a.ma.si.li.en'sis. L. gen. masc. n. goleamassiliensis, from
the combination of El Golea, the Algerian region where the strain was isolated, and
massiliensis, of Massilia, the Latin name of Marseille, where the strain was sequenced). It
was isolated from an evaporitic sediment in El Golea Lake, Algeria.

Colonies are smooth, salmon-pigmented and small (1 to 2 mm in diameter) under optimal growth
conditions. The strain is a strictly aerobic, extremely halophilic and moderately thermophilic
archaeon. Growth occurs at NaCl concentrations of 15-30%, at pH values in the range 7-11, and
within the temperature range 40-60°C. Optimal NaCl concentration, pH and temperature for
growth are 22.5-25%, 8.0 and 55°C, respectively. Magnesium is not required for growth. Cells
are coccus-shaped (0.8-1.5 µm), Gram-negative, non-motile and lyse in distilled water. Cells
are positive for catalase, oxidase and lysine decarboxylase production and negative for
urease, arginine dihydrolase, ornithine decarboxylase, tryptophanase, phosphatase,
β-galactosidase, and D-mannitol, saccharose, starch, dextrose and D-fructose fermentation. The
following substrates are utilized as single carbon and energy sources for growth: pyruvate,
D-glucose, D-mannose, D-ribose, D-xylose, maltose, sucrose, lactose, casamino acids,
bacto-peptone, bacto-tryptone and yeast extract. Tween 80, gelatin, and lipids from egg yolk
are hydrolysed, whereas urea, starch and casein are not. The methyl red, Voges-Proskauer and
Simmons' citrate tests and H2S production are negative.

Cells are susceptible to bacitracin, novobiocin, streptomycin and sulfamethoxazole but
resistant to ampicillin, cephalothin, chloramphenicol, erythromycin, gentamicin, kanamycin,
nalidixic acid, penicillin G, rifampicin, tetracycline and vancomycin.

The G+C content of the DNA is 66.06%. The 16S rRNA and genome sequences are deposited in
GenBank and EMBL under accession numbers KC430940 and CBMB010000001-CBMB010000011,
respectively. The type strain IIH3T (= CSUR P3036 = DSM 27562) was isolated from an evaporitic
sediment in El Golea Lake, Algeria.